这项工作介绍了使用伪层作为费米子决定因素的随机估计量的费米子晶状体理论中基于流动采样的量规均值架构。这是最先进的晶格场理论计算中的默认方法,这使得对流向模型在QCD等理论的实际应用至关重要。还概述了通过标准技术(例如/奇数预处理和HasenBusch分解)来改进基于流的采样方法的方法。提供了二维U(1)和SU(3)具有$ n_f = 2 $ FERMIONS的量规理论的数值演示。
translated by 谷歌翻译
我们提出了一种基于标准化流动的机器学习方法,用于建模原子固体。我们的模型将一个分析的易诊断分布转换为目标固体,而无需进行地面真实样品进行培训。我们向赫尔莫霍尔茨自由能量估算报告为单立方和六角形冰,如解象水,以及截断的leennard-jones系统,并发现它们与文学价值观的良好协议以及既定基线方法的估计。我们进一步研究了结构性,并表明模型样品几乎与分子动力学所获得的模型难以区分。因此,我们的结果表明,标准化流动可以提供高质量的样品和固体的自由能估计,而无需多阶段或用于对晶体几何体施加的限制。
translated by 谷歌翻译
基于标准化流的算法是由于有希望的机器学习方法,以便以可以使渐近精确的方式采样复杂的概率分布。在格子场理论的背景下,原则上的研究已经证明了这种方法对标量理论,衡量理论和统计系统的有效性。这项工作开发了能够使用动力学蜕皮的基于流动的理论采样的方法,这对于应用于粒子物理标准模型和许多冷凝物系的晶格场理论研究是必要的。作为一种实践演示,这些方法应用于通过Yukawa相互作用耦合到标量场的无大量交错的费米子的二维理论的现场配置的采样。
translated by 谷歌翻译
Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.
translated by 谷歌翻译
We introduce the XPER (eXplainable PERformance) methodology to measure the specific contribution of the input features to the predictive or economic performance of a model. Our methodology offers several advantages. First, it is both model-agnostic and performance metric-agnostic. Second, XPER is theoretically founded as it is based on Shapley values. Third, the interpretation of the benchmark, which is inherent in any Shapley value decomposition, is meaningful in our context. Fourth, XPER is not plagued by model specification error, as it does not require re-estimating the model. Fifth, it can be implemented either at the model level or at the individual level. In an application based on auto loans, we find that performance can be explained by a surprisingly small number of features. XPER decompositions are rather stable across metrics, yet some feature contributions switch sign across metrics. Our analysis also shows that explaining model forecasts and model performance are two distinct tasks.
translated by 谷歌翻译
We introduce a parametric view of non-local two-step denoisers, for which BM3D is a major representative, where quadratic risk minimization is leveraged for unsupervised optimization. Within this paradigm, we propose to extend the underlying mathematical parametric formulation by iteration. This generalization can be expected to further improve the denoising performance, somehow curbed by the impracticality of repeating the second stage for all two-step denoisers. The resulting formulation involves estimating an even larger amount of parameters in a unsupervised manner which is all the more challenging. Focusing on the parameterized form of NL-Ridge, the simplest but also most efficient non-local two-step denoiser, we propose a progressive scheme to approximate the parameters minimizing the risk. In the end, the denoised images are made up of iterative linear combinations of patches. Experiments on artificially noisy images but also on real-world noisy images demonstrate that our method compares favorably with the very best unsupervised denoisers such as WNNM, outperforming the recent deep-learning-based approaches, while being much faster.
translated by 谷歌翻译
The French National Institute of Geographical and Forest Information (IGN) has the mission to document and measure land-cover on French territory and provides referential geographical datasets, including high-resolution aerial images and topographic maps. The monitoring of land-cover plays a crucial role in land management and planning initiatives, which can have significant socio-economic and environmental impact. Together with remote sensing technologies, artificial intelligence (IA) promises to become a powerful tool in determining land-cover and its evolution. IGN is currently exploring the potential of IA in the production of high-resolution land cover maps. Notably, deep learning methods are employed to obtain a semantic segmentation of aerial images. However, territories as large as France imply heterogeneous contexts: variations in landscapes and image acquisition make it challenging to provide uniform, reliable and accurate results across all of France. The FLAIR-one dataset presented is part of the dataset currently used at IGN to establish the French national reference land cover map "Occupation du sol \`a grande \'echelle" (OCS- GE).
translated by 谷歌翻译
Reflection high-energy electron diffraction (RHEED) is a powerful tool in molecular beam epitaxy (MBE), but RHEED images are often difficult to interpret, requiring experienced operators. We present an approach for automated surveillance of GaAs substrate deoxidation in MBE reactors using deep learning based RHEED image-sequence classification. Our approach consists of an non-supervised auto-encoder (AE) for feature extraction, combined with a supervised convolutional classifier network. We demonstrate that our lightweight network model can accurately identify the exact deoxidation moment. Furthermore we show that the approach is very robust and allows accurate deoxidation detection during months without requiring re-training. The main advantage of the approach is that it can be applied to raw RHEED images without requiring further information such as the rotation angle, temperature, etc.
translated by 谷歌翻译
近年来,关于如何在公平限制下学习机器学习模型的越来越多的工作,通常在某些敏感属性方面表达。在这项工作中,我们考虑了对手对目标模型具有黑箱访问的设置,并表明对手可以利用有关该模型公平性的信息,以增强他对训练数据敏感属性的重建。更确切地说,我们提出了一种通用的重建校正方法,该方法将其作为对手进行的初始猜测,并纠正它以符合某些用户定义的约束(例如公平信息),同时最大程度地减少了对手猜测的变化。提出的方法对目标模型的类型,公平感知的学习方法以及对手的辅助知识不可知。为了评估我们的方法的适用性,我们对两种最先进的公平学习方法进行了彻底的实验评估,使用四个具有广泛公差的不同公平指标以及三个不同大小和敏感属性的数据集。实验结果证明了提出的方法改善训练集敏感属性的重建的有效性。
translated by 谷歌翻译
为了减轻模型中不希望的偏差的影响,几种方法建议预先处理输入数据集,以通过防止敏感属性的推断来减少歧视风险。不幸的是,这些预处理方法中的大多数导致一代新分布与原始分布有很大不同,因此通常导致不切实际的数据。作为副作用,这种新的数据分布意味着需要重新训练现有模型才能做出准确的预测。为了解决这个问题,我们提出了一种新颖的预处理方法,我们将根据保护组的分布转换为所选目标一个,并具有附加的隐私约束,其目的是防止敏感敏感的推断属性。更确切地说,我们利用Wasserstein Gan和Attgan框架的最新作品来实现数据点的最佳运输以及强制保护属性推断的歧视器。我们提出的方法可以保留数据的可解释性,并且可以在不定义敏感组的情况下使用。此外,我们的方法可以专门建模现有的最新方法,从而提出对这些方法的统一观点。最后,关于真实和合成数据集的一些实验表明,我们的方法能够隐藏敏感属性,同时限制数据的变形并改善了后续数据分析任务的公平性。
translated by 谷歌翻译